This README is a guide to the files disclosed for replicating Ho & Ruzic (2021).

Files:
1_Extract_ASM_CMF_from_SAS.sas
	- Combines data from the Annual Survey of Manufactures (ASM) and the Census of Manufactures (CMF), drawing on the input measures from the TFP version of the files from Foster, Grim and Haltiwanger (2016)
	- Brings in time-consistent NAICS industry codes (FK NAICS), based on the work of Fort and Klimek (2016)
	- Creates two Stata dataset outputs: CMF_ASM_1977_2001.dta and CMF_ASM_2002_2009.dta
	
2_Create_Stata_data_file.do
	- Combines the earlier outputs into CMF_ASM_1977_2009.dta
	- Filters 6-digit NAICS industry codes to ensure that they are sufficiently populated over time, given the changing industry concordance over the period; constructs some baseline measures of establishment inputs; and outputs a file called Manufacturing_1977_2009.dta
	- Creates value-added shares for each industry-year and outputs them into VA_Shares_fk.dta
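	Illustration (not part of the replication code): the share calculation in the last step can be sketched as follows. This is a minimal Python sketch that assumes each share is an industry's value added divided by total manufacturing value added in the same year; the actual definition is in 2_Create_Stata_data_file.do. The function name, the tuple layout, and the numbers are hypothetical.

```python
from collections import defaultdict


def value_added_shares(records):
    """Return {(industry, year): share of that year's total value added}.

    records: iterable of (industry, year, value_added) tuples.
    This layout is hypothetical, not the layout of the .dta files.
    """
    industry_year = defaultdict(float)
    year_total = defaultdict(float)
    for industry, year, value_added in records:
        industry_year[(industry, year)] += value_added
        year_total[year] += value_added
    # Skip years with non-positive totals to avoid dividing by zero.
    return {
        (industry, year): va / year_total[year]
        for (industry, year), va in industry_year.items()
        if year_total[year] > 0
    }


# Made-up establishment-level rows, for illustration only.
rows = [
    ("311111", 1977, 30.0),
    ("311111", 1977, 10.0),
    ("332710", 1977, 60.0),
    ("311111", 1978, 50.0),
    ("332710", 1978, 50.0),
]
shares = value_added_shares(rows)
```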
	
3_Calculate_industry_labor_revenue_elasticities.do
	- Combines data from the BLS National Compensation Survey with Manufacturing_1977_2009.dta to create industry-year labor shares that include non-wage compensation
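	Illustration (not part of the replication code): one way to build a labor share that includes non-wage compensation is to scale the wage bill by the ratio of total compensation to wages and divide by revenue. The Python sketch below assumes that approach; the exact construction is in 3_Calculate_industry_labor_revenue_elasticities.do. The function, its arguments, and the numbers are hypothetical.

```python
def labor_share(payroll, revenue, total_comp_per_wage_dollar):
    """Labor share of revenue, inclusive of non-wage compensation.

    payroll: industry-year wage bill (hypothetical input).
    revenue: industry-year revenue (hypothetical input).
    total_comp_per_wage_dollar: total compensation divided by wages,
        assumed here to come from a compensation survey.
    """
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return payroll * total_comp_per_wage_dollar / revenue


# Made-up numbers: wages are 20% of revenue and benefits add 25% on top.
share = labor_share(payroll=200.0, revenue=1000.0, total_comp_per_wage_dollar=1.25)
```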

4_Create_dataset_analysis_bootstrap.do
	- Creates a pared-down analysis dataset with only relevant variables for revenue-function estimation: estimation_dataset.dta
	- Indexes five-year estimation periods for each industry and saves them under estimation_pairs.dta
	
5a_Simple_and_bootstrap_estimations.do
	- Runs baseline revenue-function estimation in section 1 of the file without bootstrapping and saves the estimated coefficients in estimation_pairs_LP_ky.dta
	- Runs the same estimation in section 2 of the file, this time with the number of bootstrap replications set by $nreps, and saves the coefficients in estimation_pairs_LP_ky_boots$nrep.dta. This step is very slow to run.
	- Uses 5b_gmm_ky.do as the estimating program
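	Illustration (not part of the replication code): the bootstrap in section 2 follows the usual pattern of resampling with replacement and re-running the estimator on each resample. The Python sketch below shows that pattern only. It assumes simple observation-level resampling and uses the sample mean as a stand-in for the GMM program in 5b_gmm_ky.do; all names are hypothetical, and nreps plays the role of $nreps.

```python
import random
import statistics


def bootstrap_estimates(sample, estimator, nreps, seed=0):
    """Re-run `estimator` on `nreps` resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    n = len(sample)
    draws = []
    for _ in range(nreps):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        draws.append(estimator(resample))
    return draws


# Made-up data; the mean stands in for the revenue-function estimator.
data = [1.0, 2.0, 3.0, 4.0, 5.0]
point_estimate = statistics.mean(data)
reps = bootstrap_estimates(data, statistics.mean, nreps=200, seed=42)
boot_se = statistics.stdev(reps)  # bootstrap standard error of the estimate
```

	Because every replication repeats the full estimation, run time grows roughly in proportion to the number of replications.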
	
5b_gmm_ky.do
	- Program for estimating the baseline revenue elasticities (betaK and betaY)
	- Requires the amoeba optimizer for Stata (run "search amoeba" in Stata to find it)
	
6a_Misallocation_shell.do
	- Specifies parameter values to use and calls files that estimate misallocation
	
6b_Misallocation_variables_and_trimming.do
	- Runs as part of 6a_Misallocation_shell.do
	- Trims extreme outliers in terms of physical productivity (TFPQ) and distortion measures (TAU)
	- Outputs Misallocation_Sample_`fname'.dta, where the file name `fname' is specified in 6a_Misallocation_shell.do
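	Illustration (not part of the replication code): tail trimming of this kind can be sketched as below. The Python sketch assumes that an observation is dropped if it falls in either tail of TFPQ or TAU, with cutoffs computed on the full sample. The tail fraction, the rank-based cutoff rule, the field names, and the data are all hypothetical; the actual rule is in 6b_Misallocation_variables_and_trimming.do.

```python
def trim_tails(rows, keys, tail):
    """Keep rows that lie inside the central range of every variable in `keys`.

    For each key, the int(tail * n) smallest and largest values are treated
    as outliers. Cutoffs are computed on the full sample before any row
    is dropped.
    """
    n = len(rows)
    k = int(tail * n)  # number of observations cut from each tail
    bounds = {}
    for key in keys:
        values = sorted(row[key] for row in rows)
        bounds[key] = (values[k], values[n - 1 - k])
    return [
        row for row in rows
        if all(bounds[key][0] <= row[key] <= bounds[key][1] for key in keys)
    ]


# Made-up sample with one extreme TFPQ value and one extreme TAU value.
tfpq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]
tau = [0.5, 0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, -50, 0.9]
sample = [{"tfpq": q, "tau": t} for q, t in zip(tfpq, tau)]
trimmed = trim_tails(sample, keys=("tfpq", "tau"), tail=0.1)
```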

6c_Misallocation_Counterfactual.do
	- Runs as part of 6a_Misallocation_shell.do
	- Estimates misallocation using the baseline model with all establishments charging a common markup (CM) within the industry
	- Outputs Misallocation_Industry_CM_`fname'.dta, where the file name `fname' is specified in 6a_Misallocation_shell.do

6c_Misallocation_CF_VMarkups_AB.do
	- Runs as part of 6a_Misallocation_shell.do
	- Estimates misallocation using the Atkeson-Burstein variant of the model with establishments charging varying markups (VM) within the industry
	- Outputs Misallocation_Industry_VM_`fname'.dta, where the file name `fname' is specified in 6a_Misallocation_shell.do
	(This file is included for completeness. 7_Within_Industry_Variation.do runs an updated version of the AB model estimation in which we drop the top 5% of observations by size.)

6d_Misallocation_Output.do
	- Outputs estimates of returns to scale and markups
	- Outputs estimates of misallocation
	- Decomposes bias in measured misallocation	

7_Within_Industry_Variation.do
	- Correlates changes in misallocation with changes in business dynamism
	- Runs updated estimation of the Atkeson-Burstein variant of the model where revenue-function parameters have been estimated using the bottom 95% of firms by size
	